Server allowing JOIN on NULL values in certain cases if query includes ORDER BY clause

Bug #586051 reported by Patrick Crews
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Drizzle
Fix Released
Medium
PrafullaT
Dexter
Fix Released
Medium
PrafullaT

Bug Description

The server is allowing a JOIN on NULL values for certain cases if an ORDER BY clause is included.
From the test case:
SELECT table2 .`col_int` field3 FROM i table1 JOIN b table2 ON table1 .`col_int_key` = table2 .`col_int_key` WHERE table1 .`pk` IN ( 4 ) ORDER BY field3

vs.

SELECT table2 .`col_int` field3 FROM i table1 JOIN b table2 ON table1 .`col_int_key` = table2 .`col_int_key` WHERE table1 .`pk` IN ( 4 ) /* TRANSFORM_OUTCOME_UNORDERED_MATCH */

Produces this diff of result sets:
# 2010-05-26T15:38:30 --- /tmp//randgen10880-1274902710-server0.dump 2010-05-26 15:38:30.930254321 -0400
# 2010-05-26T15:38:30 +++ /tmp//randgen10880-1274902710-server1.dump 2010-05-26 15:38:30.930254321 -0400
# 2010-05-26T15:38:30 @@ -1 +0,0 @@
# 2010-05-26T15:38:30 -7

Here we can see what happened (the full result row):
SELECT table2 .`col_int` field3, table1.col_int_key, table2.col_int_key, table1.pk FROM i table1 JOIN b table2 ON table1 .`col_int_key` = table2 .`col_int_key` WHERE table1 .`pk` IN ( 4 ) ORDER BY field3;
+--------+-------------+-------------+----+
| field3 | col_int_key | col_int_key | pk |
+--------+-------------+-------------+----+
| 7 | NULL | NULL | 4 |
+--------+-------------+-------------+----+

That JOIN should not have happened. If we don't have the ORDER BY clause, the query properly returns 0 rows.

EXPLAIN output:
query with ORDER BY
# |1|SIMPLE|table1|const|PRIMARY,col_int_key|PRIMARY|4|const|1|Using filesort|
# |1|SIMPLE|table2|ref|col_int_key|col_int_key|5|const|1|Using where|
# */

query without ORDER BY
# |1|SIMPLE|table1|const|PRIMARY,col_int_key|PRIMARY|4|const|1||
# |1|SIMPLE|table2|ref|col_int_key|col_int_key|5|const|1||
# */

Related branches

Revision history for this message
Patrick Crews (patrick-crews) wrote :

Test case:
Run and observe the difference in result sets between query 0 and query 1

#/* Server0: Drizzle 2010.05.1561 */

--disable_warnings
DROP TABLE IF EXISTS `i`;
CREATE TABLE `i` (
  `col_int_key` int DEFAULT NULL,
  `pk` int NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`pk`),
  KEY `col_int_key` (`col_int_key`)
) ENGINE=InnoDB;

ALTER TABLE `i` DISABLE KEYS;
INSERT INTO `i` VALUES (-480247808,1),(-1066663936,2),(NULL,3),(NULL,4),(-1583808512,5),(NULL,6),(NULL,7),(NULL,8),(0,9),(-219152384,10);
ALTER TABLE `i` ENABLE KEYS;

DROP TABLE IF EXISTS `b`;
CREATE TABLE `b` (
  `col_int_key` int DEFAULT NULL,
  `col_int` int DEFAULT NULL,
  `pk` int NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`pk`),
  KEY `col_int_key` (`col_int_key`)
) ENGINE=InnoDB;

ALTER TABLE `b` DISABLE KEYS;
INSERT INTO `b` VALUES (NULL,7,1);
ALTER TABLE `b` ENABLE KEYS;
--enable_warnings

#/* Begin test case for query 0 */

SELECT table2 .`col_int` field3
FROM i table1 JOIN b table2 ON table1 .`col_int_key` = table2 .`col_int_key`
WHERE table1 .`pk` IN ( 4 )
ORDER BY field3 ;
#/* End of test case for query 0 */

#/* Begin test case for query 1 */

SELECT table2 .`col_int` field3
FROM i table1 JOIN b table2 ON table1 .`col_int_key` = table2 .`col_int_key`
WHERE table1 .`pk` IN ( 4 ) /* TRANSFORM_OUTCOME_UNORDERED_MATCH */ ;

#/* End of test case for query 1 */

DROP TABLE i;
DROP TABLE b;

Revision history for this message
Patrick Crews (patrick-crews) wrote :

I have seen similar behavior in MySQL, but not this exact case (or at least I can't find the bug if it exists).

A similar bug is:
http://bugs.mysql.com/bug.php?id=48916 - Server incorrectly processing HAVING clauses with an ORDER BY clause

It was a similar situation - the addition of the ORDER BY produced result sets for a HAVING clause that should have had 0 rows.

Revision history for this message
Patrick Crews (patrick-crews) wrote :

NOTE:
Using DISTINCT in such queries (with the ORDER BY / buggy queries) will result in the correct result sets (0 rows)

Changed in drizzle:
importance: Undecided → Medium
status: New → Confirmed
Revision history for this message
Patrick Crews (patrick-crews) wrote :

Also see MySQL Bug http://bugs.mysql.com/bug.php?id=52636
6.0 allowing JOINs on NULL values w/ optimizer_join_cache_level = 5-8

Revision history for this message
Patrick Crews (patrick-crews) wrote :

Bug not present in MySQL 5.1

Revision history for this message
PrafullaT (prafulla-t) wrote :

Analysis:

Fix for Bug 586051

1. test_if_ref method which checks whether predicate is already evaluated
   due to ref/eq_ref access or not was incorrectly removing a predicate
   that was not implicitly evaluated due to ref access (due to presence of filesort ?)
   It was field=NULL predicate.
   Such predicate should be kept and execution engine will filter out rows
   correctly. Removal of such predicate led to returning of rows which had
   NULLs for join/predicate columns.
2. field COMP_OP NULL will always false for all fields except when COMP_OP
   is NULL-safe equality operator. Modified range optimizer to return zero
   row count in such cases.
   Query now does not even run. It returns zero result. As such Fix(1) is not
   required but we might hit that case in some other query (I have not tried it
   yet)
3. Fixed Field::val_str to print "NULL" for literal NULL instead of "0". It
   added lot of confusion while debugging. This has led to some result mismatch
   for subselect.test and might also show up in other tests . We need to update those
   results.

Revision history for this message
PrafullaT (prafulla-t) wrote :

Hi Patrick,
I have pushed fix for this in a branch
http://bazaar.launchpad.net/~prafulla-tekawade/drizzle/drz-bugfix-586051

I am expecting some result mismtaches for some of the tests due to (3) above
Could you please run your regular tests and update those results and if you find
everything ok, propose this one for merge ?

PrafullaT (prafulla-t)
Changed in drizzle:
assignee: nobody → Prafulla Tekawade (prafulla-tekawade)
Revision history for this message
Patrick Crews (patrick-crews) wrote : Re: [Bug 586051] Re: Server allowing JOIN on NULL values in certain cases if query includes ORDER BY clause

Hi Prafulla,

I'll take a look at this patch today. Thanks for looking at this!

On Fri, Aug 6, 2010 at 7:42 AM, Prafulla Tekawade <<email address hidden>
> wrote:

> ** Changed in: drizzle
> Assignee: (unassigned) => Prafulla Tekawade (prafulla-tekawade)
>
> --
> Server allowing JOIN on NULL values in certain cases if query includes
> ORDER BY clause
> https://bugs.launchpad.net/bugs/586051
> You received this bug notification because you are a direct subscriber
> of the bug.
>
> Status in A Lightweight SQL Database for Cloud and Web: Confirmed
>
> Bug description:
> The server is allowing a JOIN on NULL values for certain cases if an ORDER
> BY clause is included.
> >From the test case:
> SELECT table2 .`col_int` field3 FROM i table1 JOIN b table2 ON table1
> .`col_int_key` = table2 .`col_int_key` WHERE table1 .`pk` IN ( 4 )
> ORDER BY field3
>
> vs.
>
> SELECT table2 .`col_int` field3 FROM i table1 JOIN b table2 ON table1
> .`col_int_key` = table2 .`col_int_key` WHERE table1 .`pk` IN ( 4 ) /*
> TRANSFORM_OUTCOME_UNORDERED_MATCH */
>
> Produces this diff of result sets:
> # 2010-05-26T15:38:30 --- /tmp//randgen10880-1274902710-server0.dump
> 2010-05-26 15:38:30.930254321 -0400
> # 2010-05-26T15:38:30 +++ /tmp//randgen10880-1274902710-server1.dump
> 2010-05-26 15:38:30.930254321 -0400
> # 2010-05-26T15:38:30 @@ -1 +0,0 @@
> # 2010-05-26T15:38:30 -7
>
> Here we can see what happened (the full result row):
> SELECT table2 .`col_int` field3, table1.col_int_key, table2.col_int_key,
> table1.pk FROM i table1 JOIN b table2 ON table1 .`col_int_key` =
> table2 .`col_int_key` WHERE table1 .`pk` IN ( 4 ) ORDER BY field3;
> +--------+-------------+-------------+----+
> | field3 | col_int_key | col_int_key | pk |
> +--------+-------------+-------------+----+
> | 7 | NULL | NULL | 4 |
> +--------+-------------+-------------+----+
>
> That JOIN should not have happened. If we don't have the ORDER BY clause,
> the query properly returns 0 rows.
>
> EXPLAIN output:
> query with ORDER BY
> # |1|SIMPLE|table1|const|PRIMARY,col_int_key|PRIMARY|4|const|1|Using
> filesort|
> # |1|SIMPLE|table2|ref|col_int_key|col_int_key|5|const|1|Using where|
> # */
>
>
> query without ORDER BY
> # |1|SIMPLE|table1|const|PRIMARY,col_int_key|PRIMARY|4|const|1||
> # |1|SIMPLE|table2|ref|col_int_key|col_int_key|5|const|1||
> # */
>
> To unsubscribe from this bug, go to:
> https://bugs.launchpad.net/drizzle/+bug/586051/+subscribe
>

PrafullaT (prafulla-t)
Changed in drizzle:
status: Confirmed → Fix Committed
Revision history for this message
Patrick Crews (patrick-crews) wrote :

Tested the patch with the randgen (result set + execution time comparisons).
Bug is fixed via the patch - randgen testing looked good.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.